# Cross-modal translation
Seamless M4t V2 Large
SeamlessM4T is a large-scale multilingual multimodal machine translation model supporting speech and text translation in nearly 100 languages.
Text-to-Audio Supports Multiple Languages
audo
39
17
Vit Roberta Fa Image Captioning Flickr30k
A Persian image captioning model built on a ViT+RoBERTa architecture, designed to generate Persian text descriptions of images.
Image-to-Text Other
hezarai
85
1
Opus Mt Ase En
Apache-2.0
A Transformer-based machine translation model that translates American Sign Language (ASE) to English (EN).
Machine Translation
Transformers Supports Multiple Languages

Helsinki-NLP
16
0